#puzzle-based evaluation
13/06/2025
Apple Study Exposes Critical Weaknesses in AI Reasoning Through Puzzle Tests
Apple researchers have uncovered fundamental weaknesses in large reasoning models through controlled puzzle evaluations, showing that performance drops significantly as task complexity increases.